3.2.20. DNS - Domain Name System

Users prefer to refer to hosts, mailboxes, and other resources not by their binary network addresses but by ASCII strings, such as tana@art.ucsb.edu. The network itself, however, understands only binary addresses, so some mechanism is required to convert ASCII strings to network addresses. This section describes how that mapping is accomplished in the Internet.

The mapping is done by DNS (the Domain Name System).

The essence of DNS is a hierarchical, domain-based naming scheme and a distributed database system for implementing this naming scheme. It is primarily used for mapping host names and email destinations to IP addresses but can also be used for other purposes. DNS is defined in RFC 1034 and 1035.

DNS is used as follows. To map a name onto an IP address, an application program calls a library procedure called the resolver, passing it the name as a parameter. The resolver sends a UDP packet to a local DNS server, which looks up the name and returns the IP address to the resolver, which in turn returns it to the caller. Armed with the IP address, the program can then establish a TCP connection with the destination or send it UDP packets.
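To make the resolver's UDP packet more concrete, the following minimal sketch builds the bytes of a single query for the address of a name, following the message layout in RFC 1035. The function names encode_name and build_query and the query identifier 0x1234 are illustrative assumptions, not part of any real resolver library, and the sketch does not send anything.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a query asking for the A record (type 1, class IN = 1) of `name`."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # Header: id, flags, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", 1, 1)
    return header + question


packet = build_query("fmph.uniba.sk")
print(len(packet), packet.hex())
```

A real resolver would send these bytes in a UDP datagram to port 53 of the local name server and then parse the reply, which this sketch leaves out.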

3.2.21. The DNS Name Space

Conceptually, the Internet is divided into several hundred top-level domains, where each domain covers many hosts. Each domain is partitioned into subdomains, and these are further partitioned, and so on. All these domains can be represented by a tree, as in Fig. 7-25. The leaves of the tree represent domains that have no subdomains. A leaf domain may contain a single host, or it may represent a company and contain thousands of hosts.

Fig. 7-25. A portion of the Internet domain name space.

The top-level domains are of two kinds: generic and country. The generic domains are com (commercial), edu (educational institutions), gov (the U.S. federal government), int (certain international organizations), mil (the U.S. armed forces), net (network providers), and org (nonprofit organizations). The country domains include one entry for every country, as defined in ISO 3166.

Each domain is named by the path upward from it to the (unnamed) root. The components are separated by periods (pronounced "dot"). Thus the Faculty of Mathematics and Physics of Comenius University is fmph.uniba.sk.

Domain names are case insensitive, so edu and EDU mean the same thing. Component names can be up to 63 characters long, and full path names must not exceed 255 characters.
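The rules above (case insensitivity, at most 63 characters per component, at most 255 characters per full name) can be checked with a few lines of code. This is only a sketch; the function name normalize_name is a hypothetical helper, and it does not check which characters are allowed.

```python
def normalize_name(name: str) -> str:
    """Return the lower-cased name, or raise ValueError if a length limit is exceeded."""
    if len(name) > 255:
        raise ValueError("full name is longer than 255 characters")
    for label in name.split("."):
        if not label:
            raise ValueError("empty component in name")
        if len(label) > 63:
            raise ValueError("component is longer than 63 characters")
    # Names are case insensitive, so compare them in one canonical case.
    return name.lower()


print(normalize_name("CS.Yale.EDU") == normalize_name("cs.yale.edu"))
```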

In principle, domains can be inserted into the tree in two different ways. For example, cs.yale.edu could equally well be listed under the us country domain as cs.yale.ct.us. In practice, however, nearly all organizations in the U.S. are under a generic domain, and nearly all outside the U.S. are under the domain of their country. There is no rule against registering under two top-level domains, but doing so might be confusing, so few organizations do it.

Each domain controls how it allocates the domains under it. To create a new domain, permission is required of the domain in which it will be included. In this way, name conflicts are avoided. Once a new domain has been created and registered, it can create subdomains without getting permission from anybody higher up the tree.

Naming follows organizational boundaries, not physical networks.

3.2.22. Resource Records

Every domain can have a set of resource records associated with it. For a single host, the most common record is just its IP address, but many other kinds of resource records also exist. When a resolver gives a domain name to DNS, what it gets back are the resource records associated with that name. Thus the real function of DNS is to map domain names onto resource records.

A resource record is a five-tuple. Although they are encoded in binary, in most expositions resource records are presented in ASCII text, one line per resource record. The format we will use is as follows:

Domain_name Time_to_live Type Class Value

The Domain_name tells the domain to which this record applies. Normally, many records exist for each domain. When a query is made about a domain, all the matching records of the type requested are returned.

The Time_to_live field gives an indication of how stable the record is. Information that is highly stable is assigned a large value, such as 86400 (the number of seconds in 1 day). Information that is highly volatile is assigned a small value, such as 60 (1 minute).

The Type field tells what kind of record this is. The most important types are:

SOA - Start of Authority. Provides the name of the primary source of information about the name server's zone and some further information about it.
A - IP address of a host. It is the most important record type. It holds a 32-bit IP address for some host. If a host has several network connections, and therefore several IP addresses, it has one resource record for each of them.
MX - Mail exchange. It specifies the name of the host prepared to accept email for the specified domain.
NS - Name server. It specifies a name server for the domain. For example, every DNS database normally has an NS record for each of the top-level domains.
CNAME - Canonical name. This record allows aliases to be created.
PTR - Pointer. This is an alias for an IP address. Records of this type are nearly always used to associate a name with an IP address, so that a lookup of the IP address returns the name of the corresponding machine. We omit the details of this process here.
HINFO - Host description. This record allows people to find out what kind of machine and operating system a domain corresponds to.
TXT - Text. This record allows domains to identify themselves in arbitrary ways.

For Internet information, the Class field is always IN. For non-Internet information, other codes can be used.

The Value field can contain a number, a domain name, or an ASCII string. The semantics depend on the record type. A short description of the Value fields for each of the principal record types is given in Fig. 7-26.

Fig. 7-26. The principal DNS resource record types.
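As an illustration of the five-field text format described above (Domain_name Time_to_live Type Class Value), the following sketch parses one line into a record. The class name ResourceRecord, the function parse_record, and the sample lines (which use the reserved example.com name and a documentation IP address) are assumptions made for this example and are not taken from the figures.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    domain_name: str
    time_to_live: int  # seconds
    rtype: str         # e.g. A, MX, NS, CNAME, TXT
    rclass: str        # IN for Internet information
    value: str         # meaning depends on rtype


def parse_record(line: str) -> ResourceRecord:
    """Parse one text line in the order Domain_name Time_to_live Type Class Value."""
    # Split on whitespace at most four times so the Value field may contain spaces.
    domain_name, ttl, rtype, rclass, value = line.split(None, 4)
    return ResourceRecord(domain_name, int(ttl), rtype, rclass, value)


rr = parse_record("host.example.com 86400 A IN 192.0.2.10")
print(rr.rtype, rr.time_to_live, rr.value)
```

Keeping the Value field as an uninterpreted string is a deliberate simplification; a fuller parser would decode it according to the record type.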

As an example of the kind of information one can find in the DNS database of a domain, see Fig. 7-27. This figure depicts part of a database for the cs.vu.nl domain shown in Fig. 7-25. The database contains seven types of resource records.

Fig. 7-27. A portion of a possible DNS database for cs.vu.nl

The first noncomment line of Fig. 7-27 gives some basic information about the domain, which will not concern us further.

The next two lines give textual information about where the domain is located.

Then come two entries giving the first and second places to try to deliver email sent to person@cs.vu.nl. The machine zephyr should be tried first. If that fails, top should be tried next.
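The "first place, then second place" ordering can be sketched as a sort over the mail exchange entries. The sketch assumes that each MX value carries a numeric preference and that the lowest number is tried first, as in the MX format of RFC 1035; the preference values 1 and 2 and the function name delivery_order are illustrative assumptions, since the figure itself is not reproduced here.

```python
def delivery_order(mx_records):
    """Return mail hosts sorted so that the lowest preference value comes first."""
    return [host for preference, host in sorted(mx_records)]


# (preference, host) pairs; the preference numbers are assumed for illustration.
mx = [(2, "top.cs.vu.nl"), (1, "zephyr.cs.vu.nl")]
print(delivery_order(mx))  # ['zephyr.cs.vu.nl', 'top.cs.vu.nl']
```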

The next three lines state that flits is a Sun workstation running UNIX and give both of its IP addresses.

The following three lines give choices for handling email sent to flits.cs.vu.nl.

Next comes an alias, www.cs.vu.nl, so that this address can be used without designating a specific machine. The same is done for ftp.cs.vu.nl.

The next four lines contain a typical entry for a workstation, in this case rowboat.cs.vu.nl. The information provided contains the IP address, the primary and secondary mail drops, and information about the machine. Then comes an entry for a non-UNIX system that is not capable of receiving mail itself, followed by an entry for a laser printer.

IP addresses for root servers needed to look up distant hosts are not in this file. They are present in a system configuration file loaded into the DNS cache when the server is booted. They have very long timeouts, so once loaded, they are never purged from the cache.

3.2.23. Name Servers

In practice, a single name server cannot hold the entire DNS database. The DNS name space is therefore divided into nonoverlapping zones, and each zone contains name servers holding the authoritative information about that zone (see Fig. 7-28 for one possible way of dividing up the name space of Fig. 7-25). Normally, a zone will have one primary name server, which gets its information from a file on its disk, and one or more secondary name servers, which get their information from the primary name server.

Fig. 7-28. Part of the DNS name space showing the division into zones.
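Because zones do not overlap, each name belongs to exactly one zone: the one whose name is the longest suffix of the name being looked up. A minimal sketch of that rule follows; the set of zone names is a hypothetical division chosen for illustration and is not read off Fig. 7-28, and find_zone is an invented helper name.

```python
def find_zone(name, zones):
    """Return the zone whose name is the longest suffix of `name`, or None."""
    labels = name.lower().rstrip(".").split(".")
    # Try the full name first, then drop the leftmost component each time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return None  # no listed zone covers this name


# Hypothetical zone names, for illustration only.
zones = {"edu", "yale.edu", "cs.yale.edu", "nl", "vu.nl", "cs.vu.nl"}
print(find_zone("linda.cs.yale.edu", zones))  # cs.yale.edu
print(find_zone("eng.yale.edu", zones))       # yale.edu
```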

When a resolver has a query about a domain name, it passes the query to one of the local name servers. If the domain being sought falls under the jurisdiction of the name server, it returns the authoritative resource records. An authoritative record is one that comes from the authority that manages the record, and thus is always correct. Authoritative records are in contrast with cached records, which may be out of date.

If, however, the domain is remote and no information about the requested domain is available locally, the name server sends a query message to the top-level name server for the domain requested. If that server does not know the answer either, it passes the query to one of its children, and so on. When a server with the authoritative resource record is reached, the response is sent back through each name server in the chain. For an example, see Fig. 7-29, where the IP address of the host linda.cs.yale.edu is sought by the resolver on flits.cs.vu.nl.

Fig. 7-29. How a resolver looks up a remote name in eight steps.
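The chain of queries for linda.cs.yale.edu can be imitated with a small in-memory model in which each server either answers from its own records or passes the query on and relays the reply. This is a sketch only: the NameServer class, the next_hops table, and the IP addresses (taken from a range reserved for documentation) are assumptions, and no real network traffic is involved.

```python
class NameServer:
    def __init__(self, zone, records=None, next_hops=None):
        self.zone = zone
        self.records = records or {}      # authoritative: name -> address
        self.next_hops = next_hops or {}  # zone suffix -> NameServer to ask

    def resolve(self, name, trace):
        """Answer locally if possible; otherwise forward and relay the reply."""
        trace.append(self.zone)
        if name in self.records:
            return self.records[name]
        for suffix, server in self.next_hops.items():
            if name.endswith("." + suffix):
                # The reply travels back through this server to the caller.
                return server.resolve(name, trace)
        return None


cs_yale = NameServer("cs.yale.edu", records={"linda.cs.yale.edu": "192.0.2.7"})
yale = NameServer("yale.edu", next_hops={"cs.yale.edu": cs_yale})
edu = NameServer("edu", next_hops={"yale.edu": yale})
local = NameServer("cs.vu.nl", records={"flits.cs.vu.nl": "192.0.2.1"},
                   next_hops={"edu": edu})

trace = []
print(local.resolve("linda.cs.yale.edu", trace), trace)
```

The four entries collected in trace correspond to the four servers the query passes through on the way out; the nested return statements model the four steps on the way back.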

Once the record gets back to the cs.vu.nl name server, it will be entered into a cache there, in case it is needed later. However, this information is not authoritative, so it should not live too long. This is the reason that the Time_to_live field is included in each resource record. It tells remote name servers how long to cache records.
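The role of Time_to_live in caching can be sketched as a small cache that forgets an entry once its lifetime has passed. The class name RecordCache and its methods are hypothetical; the clock is passed in as a parameter only so that the behaviour can be demonstrated without waiting.

```python
import time


class RecordCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time in seconds)

    def put(self, name, value, ttl):
        """Store a non-authoritative copy that may be used for `ttl` seconds."""
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        """Return the cached value, or None if it is absent or has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # too old: must be looked up again
            return None
        return value


now = [0.0]  # a fake clock so the example runs instantly
cache = RecordCache(clock=lambda: now[0])
cache.put("linda.cs.yale.edu", "192.0.2.7", ttl=60)
print(cache.get("linda.cs.yale.edu"))  # still fresh
now[0] = 61.0
print(cache.get("linda.cs.yale.edu"))  # None: the entry has expired
```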

The query method described here is known as a recursive query. An alternative form is also possible. In this form, when a query cannot be satisfied locally, the query fails, but the name of the next server along the line to try is returned. This procedure gives the client more control over the search process.
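The alternative form can be sketched as a loop run by the client: each server either answers or names the next server to try, and the client itself follows the referrals. The SERVERS table, the functions ask and lookup_by_referral, and the documentation-range IP address are all assumptions made for this illustration.

```python
# Hypothetical servers, keyed by zone name. Each knows some records and
# the zones it can refer a client to.
SERVERS = {
    "cs.vu.nl": {"records": {}, "referrals": ["edu"]},
    "edu": {"records": {}, "referrals": ["yale.edu"]},
    "yale.edu": {"records": {}, "referrals": ["cs.yale.edu"]},
    "cs.yale.edu": {"records": {"linda.cs.yale.edu": "192.0.2.7"}, "referrals": []},
}


def ask(server, name):
    """One query: return ("answer", address), ("referral", server), or ("fail", None)."""
    data = SERVERS[server]
    if name in data["records"]:
        return ("answer", data["records"][name])
    for zone in data["referrals"]:
        if name.endswith("." + zone):
            return ("referral", zone)
    return ("fail", None)


def lookup_by_referral(name, first_server, max_hops=8):
    """Follow referrals from the client side until an answer or a dead end."""
    server, asked = first_server, []
    for _ in range(max_hops):
        asked.append(server)
        kind, value = ask(server, name)
        if kind == "answer":
            return value, asked
        if kind == "fail":
            return None, asked
        server = value  # the client decides to contact the referred server
    return None, asked


print(lookup_by_referral("linda.cs.yale.edu", "cs.vu.nl"))
```

Because the loop lives in the client, it can stop early, pick among several referrals, or cap the number of hops, which is the extra control mentioned above.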

When a DNS client fails to get a response before its timer goes off, it normally tries another server the next time.
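This failover behaviour can be sketched as a loop over a list of servers. The sketch assumes a caller-supplied function send_query(server, name) that raises TimeoutError when the timer expires; that function, the server names, and the addresses are hypothetical stand-ins, not a real DNS API.

```python
def query_with_failover(name, servers, send_query):
    """Try each server in turn, moving on whenever a query times out."""
    for server in servers:
        try:
            return send_query(server, name)
        except TimeoutError:
            continue  # no response in time: try the next server
    raise TimeoutError("no name server answered for " + name)


def fake_send(server, name):
    """Stand-in for a real query: the first server never answers."""
    if server == "ns1.example.com":
        raise TimeoutError
    return "192.0.2.7"


print(query_with_failover("linda.cs.yale.edu",
                          ["ns1.example.com", "ns2.example.com"], fake_send))
```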